[Clang] Fix __cpuidex conflict with CUDA #152556

Merged: boomanaiden154 merged 1 commit into llvm:main from the cpuid-conflict-nvptx branch on Aug 7, 2025

Conversation

boomanaiden154
Contributor

The landing of #126324 made __has_builtin return false for builtins that belong to the aux triple. In some CUDA offloading configurations, the host (i.e., x86_64) is the aux triple, so the __cpuidex builtin is still defined even though __has_builtin(__cpuidex) reports false. This patch explicitly carves out NVPTX so that we do not run into redefinition errors.
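As a rough illustration of the guard this patch ends up with, the sketch below folds the two conditions (the existing __has_builtin check and the new __NVPTX__ carve-out) into a single preprocessor test. DEMO_DEFINE_CPUIDEX and demo_header_defines_cpuidex are hypothetical names used only here; they are not part of cpuid.h, and the real header nests the two checks rather than combining them.

```c
/* Older compilers may lack __has_builtin; treat every query as "no". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* DEMO_DEFINE_CPUIDEX is a hypothetical macro for this sketch only.
 * It is 1 when a header using this guard would emit its own __cpuidex:
 * the compiler does not report a builtin, and we are not compiling for
 * NVPTX (where the builtin may still come from the aux triple). */
#if !__has_builtin(__cpuidex) && !defined(__NVPTX__)
#define DEMO_DEFINE_CPUIDEX 1
#else
#define DEMO_DEFINE_CPUIDEX 0
#endif

/* Exposes the preprocessor decision so it can be inspected at run time. */
int demo_header_defines_cpuidex(void) { return DEMO_DEFINE_CPUIDEX; }
```

Which value the function returns depends on the compiler, target, and flags (for example -fms-extensions), so no expected output is given.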

@llvmbot added the clang, backend:X86, and clang:headers labels on Aug 7, 2025
@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Aiden Grossman (boomanaiden154)


Full diff: https://github.com/llvm/llvm-project/pull/152556.diff

2 Files Affected:

  • (modified) clang/lib/Headers/cpuid.h (+5)
  • (modified) clang/test/Headers/__cpuidex_conflict.c (+1)
diff --git a/clang/lib/Headers/cpuid.h b/clang/lib/Headers/cpuid.h
index 52addb7bfa856..ce8c79e77dc18 100644
--- a/clang/lib/Headers/cpuid.h
+++ b/clang/lib/Headers/cpuid.h
@@ -345,10 +345,15 @@ static __inline int __get_cpuid_count (unsigned int __leaf,
 // In some configurations, __cpuidex is defined as a builtin (primarily
 // -fms-extensions) which will conflict with the __cpuidex definition below.
 #if !(__has_builtin(__cpuidex))
+// In some cases, offloading will set the host as the aux triple and define the
+// builtin. Given __has_builtin does not detect builtins on aux triples, we need
+// to explicitly check for some offloading cases.
+#ifndef __NVPTX__
 static __inline void __cpuidex(int __cpu_info[4], int __leaf, int __subleaf) {
   __cpuid_count(__leaf, __subleaf, __cpu_info[0], __cpu_info[1], __cpu_info[2],
                 __cpu_info[3]);
 }
 #endif
+#endif
 
 #endif /* __CPUID_H */
diff --git a/clang/test/Headers/__cpuidex_conflict.c b/clang/test/Headers/__cpuidex_conflict.c
index 74f45327de2bb..d14ef293e586d 100644
--- a/clang/test/Headers/__cpuidex_conflict.c
+++ b/clang/test/Headers/__cpuidex_conflict.c
@@ -5,6 +5,7 @@
 
 // Ensure that we do not run into conflicts when offloading.
 // RUN: %clang_cc1 %s -DIS_STATIC=static -ffreestanding -fopenmp -fopenmp-is-target-device -aux-triple x86_64-unknown-linux-gnu
+// RUN: %clang_cc1 -DIS_STATIC="" -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -aux-target-cpu x86-64 -fcuda-is-device -internal-isystem /home/gha/llvm-project/build/lib/clang/22/include -x cuda %s -o -
 
 typedef __SIZE_TYPE__ size_t;
 

@boomanaiden154
Contributor Author

This is a specific carve-out which I'm not a huge fan of, but it doesn't seem like there's any systematic way to identify builtins tied to the aux triple, so I think this makes sense for now.
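For context, the definition being guarded is a thin wrapper over __cpuid_count. A minimal host-side sketch of the same idea follows, assuming an x86 host whose toolchain ships <cpuid.h> with the __cpuid_count macro. demo_cpuidex and DEMO_HAVE_CPUID are hypothetical names chosen so the example cannot clash with any __cpuidex the toolchain already provides; the zero-filling fallback for other targets is an addition for portability, not something the header does.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#define DEMO_HAVE_CPUID 1
#else
#define DEMO_HAVE_CPUID 0
#endif

/* Hypothetical stand-in for __cpuidex. Fills cpu_info with EAX, EBX, ECX,
 * EDX for the given leaf/subleaf. Returns 1 on x86 targets and 0 on
 * targets without CPUID, where cpu_info is zero-filled instead. */
int demo_cpuidex(int cpu_info[4], int leaf, int subleaf) {
#if DEMO_HAVE_CPUID
  unsigned int eax, ebx, ecx, edx;
  __cpuid_count(leaf, subleaf, eax, ebx, ecx, edx);
  cpu_info[0] = (int)eax;
  cpu_info[1] = (int)ebx;
  cpu_info[2] = (int)ecx;
  cpu_info[3] = (int)edx;
  return 1;
#else
  (void)leaf;
  (void)subleaf;
  cpu_info[0] = cpu_info[1] = cpu_info[2] = cpu_info[3] = 0;
  return 0;
#endif
}
```

Calling demo_cpuidex(info, 0, 0) on an x86 host leaves the highest supported basic leaf in info[0]. Using a distinct name sidesteps the redefinition problem entirely, which is why the sketch does not need either guard from the patch.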

// In some cases, offloading will set the host as the aux triple and define the
// builtin. Given __has_builtin does not detect builtins on aux triples, we need
// to explicitly check for some offloading cases.
#ifndef __NVPTX__
Member

@sarnex sarnex Aug 7, 2025

Thanks so much for the quick investigation!

I'm not really familiar with CUDA offloading, but if the cc1 command is cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu, then __has_builtin(__cpuidex) returning false seems correct. I would therefore think the actual root cause is that __cpuidex is getting defined for CUDA with -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu.

If so, I'm happy to have this workaround for now, but we might want to file a tracking issue so the root cause gets investigated.

Contributor Author

Good point. I've filed #152558 to track this.

Member

Awesome, thanks again!

@boomanaiden154 merged commit 6605551 into llvm:main on Aug 7, 2025
13 checks passed
@boomanaiden154 deleted the cpuid-conflict-nvptx branch on August 7, 2025 at 18:11
3 participants